16 Support Vector Machines
Area of Study 3: Computer science: past and present
Learning Intentions
Key knowledge
- the concept of training algorithms using data
- the concepts of model overfitting and underfitting
- support vector machines (SVM) as margin-maximising linear classifiers, including:
- the geometric interpretation of applying SVM binary classification to one- or two-dimensional data
- the creation of a second feature from one-dimensional data to allow linear classification
Key skills
- explain, at a high level, how data-driven algorithms can learn from data
- explain the optimisation objectives for training SVM and neural network binary classifiers
- explain how higher dimensional data can be created to allow for linear classification
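The last key skill, creating higher-dimensional data to allow linear classification, can be previewed with a small sketch. The data below is hypothetical: a 1D dataset where one class sits between the other, so no single threshold on x can separate them. Mapping each x to the pair (x, x²) creates a second feature, and in 2D a horizontal line does separate the classes.

```python
# Sketch: 1D data that no single threshold on x can separate.
# Class 1 sits near the origin; class 0 sits on either side of it.
data = [(-2.0, 0), (-1.5, 0), (-0.5, 1), (0.0, 1), (0.5, 1), (1.5, 0), (2.0, 0)]

# Create a second feature: map each x to the 2D point (x, x**2).
lifted = [((x, x * x), label) for x, label in data]

# In 2D the classes ARE linearly separable: the horizontal line x2 = 1
# puts every class-1 point below it and every class-0 point above it.
def classify(point, threshold=1.0):
    _, x2 = point
    return 1 if x2 < threshold else 0

assert all(classify(p) == label for p, label in lifted)
print("separated by the line x2 = 1")
```

The separating line x₂ = 1 in the lifted space corresponds to the pair of thresholds x = ±1 back in the original 1D space, which is exactly the non-linear boundary the raw data needed.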
Machine Learning Algorithms
A machine learning algorithm is a procedure that allows a computer to improve its performance at a task by learning from data, rather than being given only explicit, hand-coded instructions.
- It takes examples (data) as input.
- It uses a model to find patterns or rules in that data.
- It can then make predictions or decisions on new, unseen inputs.
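The three bullet points above can be sketched in a few lines. The toy data and the "predict 1 if x ≥ t" rule are illustrative, not part of any real system: the program is given labelled examples, searches for the rule that best fits them, and then applies that learned rule to new inputs.

```python
# Minimal sketch of "learning from data": choose the decision threshold
# that makes the fewest mistakes on labelled training examples.
train = [(1.0, 0), (2.0, 0), (2.5, 0), (4.0, 1), (5.0, 1), (6.0, 1)]

def errors(t):
    # Count training mistakes for the rule "predict 1 if x >= t".
    return sum((1 if x >= t else 0) != y for x, y in train)

# Try candidate thresholds taken from the data itself; keep the best one.
best_t = min((x for x, _ in train), key=errors)

# The learned rule can now classify new, unseen inputs.
predict = lambda x: 1 if x >= best_t else 0
print(best_t, predict(3.7))
```

The programmer never hard-codes the threshold; it is found automatically from the examples, which is the essential difference from a traditional algorithm.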
Traditional algorithms: every step is written out by a programmer.
Machine learning algorithms: the computer adjusts its own internal rules (parameters) automatically, based on training data.
Examples:
- Neural network – adjusts weights between “neurons” to recognise patterns.
- Support vector machine (SVM) – finds the best boundary (hyperplane) to separate categories.
The machine can adjust its own parameters but it does not create them.
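That last point can be made concrete with a perceptron-style update, a simpler relative of the neural networks mentioned above. The training data and update rule here are illustrative: the programmer creates the parameters (the weights w and bias b) and the rule for changing them; the algorithm only adjusts their values.

```python
# Sketch: the programmer defines the parameters (w, b) and the update rule;
# training adjusts the parameter VALUES based on the data.
train = [((2.0, 1.0), 1), ((1.0, 3.0), 1), ((-1.0, -2.0), -1), ((-2.0, 0.5), -1)]

w, b = [0.0, 0.0], 0.0               # parameters start arbitrary
for _ in range(20):                  # repeated passes over the training data
    for (x1, x2), y in train:
        if y * (w[0] * x1 + w[1] * x2 + b) <= 0:   # misclassified?
            w[0] += y * x1           # nudge the boundary toward this example
            w[1] += y * x2
            b += y

predict = lambda x1, x2: 1 if w[0] * x1 + w[1] * x2 + b > 0 else -1
print(w, b)
```

Note that nothing new is created during training: the same two weights and one bias exist before and after; only their numbers change.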
Support Vector Machines (SVMs)
- A support vector machine (SVM) is a supervised machine learning algorithm.
- Its main purpose is classification, especially binary classification (sorting inputs into one of two categories), for example:
- Email filtering (spam / not spam).
- Image recognition (cat / not cat).
- Medical diagnostics (disease / no disease).
Training the SVM
Training involves feeding the algorithm a large set of preclassified (labelled) vectors; the SVM uses these examples to position its separating boundary so that the margin between the two classes is as wide as possible.
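One common way to train a linear SVM (a sketch, not the only method) is gradient descent on the hinge loss, with a regularisation term that keeps the weights small and thereby maximises the margin. The 2D toy data, step size, and regularisation strength below are all chosen ad hoc for illustration.

```python
# Sketch: train a linear SVM on preclassified 2D vectors by minimising
# hinge loss plus a term that shrinks ||w|| (which widens the margin).
train = [((2.0, 2.0), 1), ((3.0, 1.0), 1), ((-1.0, -1.5), -1), ((-2.0, -1.0), -1)]

w, b = [0.0, 0.0], 0.0
lr, lam = 0.05, 0.01                 # step size and regularisation strength (ad hoc)
for _ in range(500):
    for (x1, x2), y in train:
        margin = y * (w[0] * x1 + w[1] * x2 + b)
        if margin < 1:               # inside the margin or misclassified: push out
            w[0] += lr * (y * x1 - lam * w[0])
            w[1] += lr * (y * x2 - lam * w[1])
            b += lr * y
        else:                        # safely classified: only shrink the weights
            w[0] -= lr * lam * w[0]
            w[1] -= lr * lam * w[1]

predict = lambda x1, x2: 1 if w[0] * x1 + w[1] * x2 + b > 0 else -1
print(w, b)
```

After training, only the points lying closest to the boundary (the support vectors) keep the weights in balance; the far-away points no longer trigger updates.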
Bias and Variance in Classification
Two types of errors when classifying data:
Bias - underfitting 🎯
- Analogy: arrows clustered together but far from the bullseye
- Comes from a model that is too simple
- Misses the real patterns
- Leads to systematic error (underfitting)
Variance - overfitting 🎯
- Analogy: arrows scattered widely around the target
- Comes from a model that is too complex
- Fits the noise as well as the signal
- Leads to unreliable predictions (overfitting)
The Trade-off
- High bias → underfitting
- High variance → overfitting
- Goal = balance → arrows tightly grouped around the bullseye
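The trade-off can be demonstrated with three toy models on hypothetical 1D data whose true rule is "class 1 iff x ≥ 5", with one deliberately mislabelled (noisy) training point. A majority-class guesser underfits, a 1-nearest-neighbour memoriser overfits the noise, and a simple threshold balances the two.

```python
# Sketch: toy data where the true rule is "1 iff x >= 5";
# the training point at 4.5 is deliberately mislabelled (noise).
train = [(0.5, 0), (1, 0), (2, 0), (3, 0), (4, 0), (4.5, 1), (6, 1), (7, 1), (8, 1)]
test = [(4.2, 0), (4.8, 0), (5.5, 1)]

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

# High bias: ignore x entirely and always predict the majority class.
majority = lambda x: 0
# High variance: memorise the training set (1-nearest-neighbour).
nearest = lambda x: min(train, key=lambda p: abs(p[0] - x))[1]
# Balanced: a simple threshold that tolerates the one noisy label.
threshold = lambda x: 1 if x >= 5 else 0

for name, model in [("majority", majority), ("1-NN", nearest), ("threshold", threshold)]:
    print(name, accuracy(model, train), accuracy(model, test))
```

The memoriser scores perfectly on the training set yet stumbles on the test point near the noisy example, while the threshold accepts one training mistake and generalises better: the arrows grouped around the bullseye.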
Key Vocabulary for SVM
- Support vector – the data points that are closest to the separating boundary; they determine the position of the hyperplane.
- Margin – the distance between the separating hyperplane and the nearest support vectors; SVM maximises this.
- Hyperplane – the boundary SVM draws to separate the classes (a line in 2D, a plane in 3D, etc.).
- Bias – error caused by using a model that is too simple (underfitting).
- Variance – error caused by a model that is too complex and too sensitive to training data (overfitting).
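The first three vocabulary terms fit together in one calculation: the distance from a point p to the hyperplane w·x + b = 0 is |w·p + b| / ‖w‖, the margin is the smallest such distance over the data, and the points achieving it are the support vectors. The line and points below are a hypothetical 2D example.

```python
# Sketch: for a given 2D hyperplane w . x + b = 0 (here the line x1 + x2 = 3),
# find the margin and the support vectors among some labelled points.
import math

w, b = (1.0, 1.0), -3.0
points = [(1.0, 1.0), (0.0, 1.0), (2.5, 1.5), (2.0, 3.0)]

def distance(p):
    # Perpendicular distance from point p to the hyperplane.
    return abs(w[0] * p[0] + w[1] * p[1] + b) / math.hypot(w[0], w[1])

margin = min(distance(p) for p in points)
support_vectors = [p for p in points if math.isclose(distance(p), margin)]
print(margin, support_vectors)
```

Here the two nearest points, one on each side of the line, are the support vectors; an SVM would slide and rotate the line until this minimum distance is as large as possible.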